ABSTRACT


Properties of programs can be mathematically proved. This report concerns
the use of such mathematical proofs as a means of verifying that programs satisfy
their specifications and other expectations of proper behavior. Moreover, the
theory by means of which programs are proved can be used in the formal reasoning
needed to construct and maintain programs. The primary current needs are: (1)
expansion of the theory to encompass more aspects of program correctness, (2)
evolution of the theory's mathematical content and form to make it more effective
in verifying programs, (3) experimentation with new and current techniques for
using the theory in verification and construction, (4) development of human
knowledge and skills to fulfill human roles of specifying and guiding program
proofs, (5) technological support to take over mechanical parts of the proofs and
follow human guidance in elaborating them.


The needed breakthroughs toward the use of program proving as a normal
programming activity are: (1) a coherent connection with program testing, (2)
evolution of the theory to the point where significant amounts of new program
proofs are adapted or reused from previous proofs, (3) development of experimental
methodology for effectively evaluating various paradigms and techniques for
program proving, (4) greatly increased mechanical theorem proving capacity to
reduce the burden on human verifiers, (5) large-scale demonstrations of program
proving to evaluate the validity of the activity and to stimulate future research
and development.


The ultimate effects of program verification are partly the intangibles of
deeper  understanding  of  programs  and  raising  of  standards  to  more  closely 
approximate  the  theoretical  perfectibility  of  programs.  More  tangible  effects  are 
having  formal  reasoning  methods  available  throughout  program  construction 
(especially  applied  to  software  components)  and  backed  up  by  extensive  formal 
proofs  of  final  products  where  warranted.  Proofs  are  seen  as  a necessary 
complement to the experimental verification provided by testing.

"It is of course important that some efforts be made to verify the
correctness of assertions that are made about a routine. There are
essentially two types of method available, the theoretical and the
experimental. In the extreme form of the theoretical method a
watertight mathematical proof is provided for the assertion. In the
extreme form of the experimental method the routine is tried out on the
machine with a variety of initial conditions and is pronounced fit if the
assertions hold in each case. Both methods have their weaknesses."

circa 1950, Programmers' Handbook for the Manchester Computer


I INTRODUCTION 


Before looking into the future of program verification it is worth
considering Turing's advice of nearly three decades ago, especially regarding the
weaknesses of the two extreme positions. The theoretical approach must deal with
the fact that very few mathematicians ever carry out a proof down to the last
detail of axioms and rules of inference because the process is simply too exhausting
(both to writer and reader) and is still prone to error. (Unlike Turing, we can
envision the possibility of mechanizing much of the detailed proof effort, although
this is one of the hardest problems of all.) The experimental approach must address
the question, "By what sound argument can you claim that the program will
satisfy the assertions when executed on data which was not part of the
experiment, even if the program executed perfectly for all data within the
experiment?" The only possible answer is another mathematical argument.
(Surprisingly, there has been little real progress in either discovering or disproving
the existence of such arguments.)

With two flawed extreme approaches, what does one do? In the Programmers'
Handbook for the Manchester Computer, Turing recommends the ever-popular, but
[illegible], "desk checking"--hand checking conscientiously while summarizing and
organizing via "check sheets." He also argues for having another programmer
complete a reduced version of the check sheets and for deliberately forgetting the
purpose and method of a routine to avoid missing program errors due to
preconceived ideas. He counsels against making alterations in the middle of the
routine without verifying that the earlier parts are unaffected, and recommends
explicitly checking by program that input assumptions are satisfied. Turing claims
that most errors will be found by such thorough checking, but also cites an
example where the probability of selecting the right case to reveal a particular
error was 2**-10. It is also recommended that the state of the machine be described
by mathematical expressions, in order to convey the "theory of a routine." His
overall approach mediates the two extremes: experimental in the use of carefully
selected  actual  data  and  theoretical  in  the  use  of  human  reasoning  to  meticulously 
check  for  errors  and  to  make  assertions  which  display  the  underlying  theory  and 
provide  the  guide  for  checking  the  program. 

This  quotation  and  summary  are  useful  for  starting  to  think  about  the  future 
of  program  verification  for  several  reasons: 

1.  They  show  more  common  sense  motivation,  technique,  and  caution  than 
most  modern  textbooks  or  programmer  handbooks,  which  view  verification  as 
either  testing,  or  proving,  or  which  don’t  discuss  it  at  all. 

2. As in Turing's time, common sense tells us that taking any extreme
position  is  fraught  with  potential  disaster.  The  most  rational  course  is  moderate, 
combining  the  strengths  and  avoiding  the  weaknesses  of  the  two  methods.  To 
dramatize  the  point,  consider  your  feelings  upon  stepping  onto  an  aircraft  piloted 
by  a completely  computerized  aircraft  control  system.  How  would  you  feel  if  told 
the  software  had  never  been  tested  but  had  been  thoroughly  proved?  Or  that  it  had 
been  tested  according  to  the  latest  standards--say  at  least  by  executing  every 
statement and a wide variety of conjectured conditions--but never exposed to a
mathematical argument concerning untested conditions? Given the few current
demonstrations of practical proving, we would probably feel safer about the latter.
But if we investigate far enough to discover the inadequacy of the current theory
behind testing and if we remember the surprises that continually follow upon the
release  of  "thoroughly  tested"  software,  we  should  also  have  some  misgivings 
about  testing  alone.  Most  of  us  would  rationally  prefer  that  correctness  had  been 
strongly  argued  for  all  data  and  had  been  fully  demonstrated  for  considerable  data. 
We  might  also  demand  that  the  entire  computer  system--hardware,  software,  and 
human operators--be justified in terms of probabilistic reliability arguments.

3. But for the sake of scientific study, we must choose one of the extreme
points  and  investigate  it  thoroughly  in  its  separate  context:  develop  its  theoretical 
basis;  explore  a variety  of  paradigms  and  techniques  for  performing  its  associated 
activities;  subject  these  paradigms  to  experimental  investigation  to  evaluate  their 
feasibility,  practicality,  and  applicability  to  various  types  of  problems;  identify  and 
acquire  the  skills  and  tools  for  performing  the  experiments  and  eventually 
pursuing  the  activity  in  practice;  and  identify,  understand,  and  accept  its  overall 
strengths  and  weaknesses. 

The  present  discussion  is  meant  to  be  followed  in  the  light  of  common  sense: 
neither  proving  nor  testing  warrants  full  confidence  as  the  only  method  of 
program  verification.  Ultimately,  the  best  course  may  be  (1)  some  combination  of 
mathematical arguments with testing, (2) application of one or the other method
in recognizably acceptable situations, or (3) perhaps the simultaneous performance
of  the  two  activities  independently  but  in  moderation.  Before  any  of  these  three 
combinations  can  be  properly  investigated,  there  must  be  a greater  understanding 
of  program  testing  than  currently  exists;  ideally  we  would  like  a real  theory.  This 
further  justifies  the  study  of  program  proving  as  not  merely  our  only  current  way 
to  reason  formally  about  programs  but  also  as  a stimulus  to  a theoretical  basis  for 
testing. 

To  summarize,  in  our  context  program  verification  (hereafter  abbreviated  as 
PV) means mathematical proof of the consistency of the program with assertions
about  it,  the  external  assertions  usually  being  called  "specifications."  The 
definition emphasizes consistency, recognizing that specifications must separately
be related to the full demands of the user. However, we will follow current
terminology  in  loosely  referring  to  consistency  as  "correctness"  and  the  associated 
activity  as  "program  proving."  The  current  view  of  the  goals  of  PV  is  to  attain  a 
high  degree  of  confidence  that  the  program  satisfies  its  specifications  relative  to 
given  semantics  of  and  assumptions  about  the  software,  hardware,  and  user 
environment  and  to  the  absence  of  errors  in  the  verification  process.  Of  course, 
this is no certainty at all, the environment being almost impossible to describe
completely  and  the  process  being  error-prone.  The  confidence  comes  from  having  a 
theoretically  sound  and  systematic  method  for  arguing  consistency,  tools  and  skills 
for carrying out the process, dedication of sufficient resources for its completion,
and various checks for errors during the process. Having all these stimulates the
improvement  of  the  environment  in  both  fact  and  description  so  as  to  reinforce  the 
confidence  gained  by  formal  reasoning. 

PV  has  an  important  payoff  besides  verification.  It  takes  a view  of  programs 
as  mathematical  objects  and  requires  a theory  about  them.  As  we  shall  see,  this 
theory  suggests  that  programs  can  be  classified  and  then  studied  by  class,  for 
example  by  task  or  technique;  shows  that  there  may  be  certain  structures  and 
relations  between  these  structures  that  generate  a large  range  of  programs 
independent  of  their  assigned  task;  rejects  programs  which  are  unduly  disorganized 
or  which  fail  to  display  their  underlying  organization;  and  begins  to  explain  the 
complexity  and  difficulty  of  programming  and  to  suggest  ways  of  surmounting 
these. In other words, PV demands a thorough and basic understanding of
programs, including language and design principles, that may ultimately strongly
influence their construction. Many researchers now view this theory as the
central reason for studying PV, a view which will be emphasized in this report.


II CURRENT PROBLEMS, OBSTACLES, AND STEPS TO OVERCOME THEM

We will explore two approaches to understanding the current state of PV: (1)
an enumeration classified as theory, technique, people, and technology and (2) a
historical and psychological analysis which reveals sources of confusion, tension,
and conflict which block a clear view of what is happening and what should
happen. Other surveys of program verification are [London77a,77c] and
[Luckham77].

The  mathematical  theory  of  programs  is  multilevel.  A first  level  relates 
statement  and  declaration  language  constructs  to  predicates.  All  current 
formalisms look something like that popularized in [Dijkstra76] as the "predicate
transformer" wlp(S,Q), the "weakest liberal precondition for a state to satisfy in
order that the predicate Q be satisfied for the state resulting from executing the
statement S." This gives P => wlp(G,Q) as the correctness theorem for a program G
with specifications P for input and Q for output. A second level deals with the
expressions of the language, requiring some axiomatization of the data types of the
language. A third level is the assertion language, which is usually a superset of the
expression language. Cutting across all three levels is some language of logic,
including means for expressing quantification, equality, implication, etc. For
example, denoting the predicate transformers for the conditional and assignment
statements by


wlp(if B then S fi, Q) = (B => wlp(S,Q)) ∧ (~B => Q)
wlp(V:=E, Q) = Q[E substituted for free occurrences of V]

and letting t be a binary tree and S a stack,

wlp(if Right(t)≠nil then S:=PushStack(S,Right(t)) fi,
    NoNilNodesOnStack(S)) =
  (Right(t)≠nil => NoNilNodesOnStack(PushStack(S,Right(t))))
  ∧ (Right(t)=nil => NoNilNodesOnStack(S))

where NoNilNodesOnStack(S) is some predicate defined for stacks.

These  are  the  types  of  theorems,  sometimes  called  verification  conditions  or  verification 
lemmas, that are usually subjected to detailed proofs in current techniques.
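
As a rough illustration of the conditional and assignment rules, the following Python sketch treats a predicate as a Boolean function on states and a predicate transformer as a function from predicates to predicates. The names wlp_assign, wlp_if, and run_if, the dictionary encoding of states, and the toy absolute-value program are assumptions made for this illustration; they are not taken from the cited formalisms.

```python
from itertools import product

# A state maps variable names to values; a predicate maps a state to a bool.

def wlp_assign(var, expr):
    """wlp(V := E, Q) = Q[E substituted for V], realized by evaluating Q
    in the state that the assignment would produce."""
    def transformer(q):
        return lambda s: q({**s, var: expr(s)})
    return transformer

def wlp_if(cond, wlp_body):
    """wlp(if B then S fi, Q) = (B => wlp(S,Q)) and (~B => Q)."""
    def transformer(q):
        pre_body = wlp_body(q)
        return lambda s: (not cond(s) or pre_body(s)) and (cond(s) or q(s))
    return transformer

def run_if(cond, var, expr, s):
    """Direct execution of `if B then V := E fi`, used only as a cross-check."""
    return {**s, var: expr(s)} if cond(s) else dict(s)

# Hypothetical program: if x < 0 then x := -x fi, with postcondition x >= 0.
cond = lambda s: s["x"] < 0
expr = lambda s: -s["x"]
post = lambda s: s["x"] >= 0
pre = wlp_if(cond, wlp_assign("x", expr))(post)

states = [{"x": x, "y": y} for x, y in product(range(-3, 4), repeat=2)]
# The computed precondition agrees with running the statement and testing Q.
agrees = all(pre(s) == post(run_if(cond, "x", expr, s)) for s in states)
# For this program the precondition holds in every sampled state.
always_true = all(pre(s) for s in states)
```

The cross-check against direct execution is only feasible because the sampled state space is finite; the point of the predicate transformer is that the same precondition can be obtained symbolically, without running the program.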

Another  level  of  theory  deals  with  the  way  data  types  and  statement  level 
theories  should  be  organized  to  facilitate  efficient  theorem  proving,  which  we  will 
discuss  further  under  technology.  Of  course,  any  individual  program  proof  also 
requires  the  theory  associated  with  its  problem  domain. 


There are many technical issues dealing with the theory. The kinds of
correctness which can be addressed range from "partial correctness," which ignores
termination questions, through various degrees of termination, e.g., "cleanness" in
not aborting on inexecutable operations and/or "nonlooping," to inclusion of
performance constraints. Current theories lean toward partial correctness with the
question of nonlooping handled by bounded counters or well-ordered functions.
Cleanness is addressed by treating the conditions for proper execution of operations
like the input specifications of functions, but exceptional conditions, including
errors in user input, other software components, and even hardware, are aspects of
correctness that have yet to be adequately addressed.

Theoretical Problem 1: Theory must address requirements far wider than "partial
correctness," including recognition and handling of abnormal conditions.

This is not simply a verification problem, since languages and methodology have
only recently considered it seriously. Correctness theory may be especially useful
in guaranteeing robustness and handling exceptions, since it already formally
describes how properties are affected as they flow through the program, including
properties relating to exceptions. Steps in this direction are [Levin77] on exception
handling, [Pratt77] on separating various nondeterministic termination issues,
[Lampson77] with "legality assertions" for proper execution of expressions, and
[Suzuki77] on checking array bounds.


Any good theory must have the qualities of "soundness," i.e., that all theorems
are valid in the desired sense, and "completeness," i.e., that all true and interesting
statements can be proved. For a correctness theory, the question is whether the
theory fully and accurately captures the semantics of the associated language. In
this respect great progress has been made in developing a general theory of
semantics as the foundation for program proving [deBakker77] and in considerable
informal experimentation with proof rules for various languages and individual
language constructs [Hoare73, London77b]. However, current languages have
complicated proof rules which are hard to justify semantically.


Theoretical Problem 2: Current theory must be stretched to include more features
of real languages and, concurrently, languages must be designed which admit
reasonably simple and complete semantic descriptions.

Jovial is addressed successfully in [Elspas77], concurrent programming constructs
are formalized in [Owicki76], pointers and records are handled in [Luckham76],
and microprograms of significant complexity are handled in [Carter77]. New
languages with a full complement of necessary features are being developed in
conjunction with some degree of formalized semantics and goals of verifiability
[Ambler77, Lampson77, Liskov77, Wulf76], but there will be a gap for several
years between theory and current languages, e.g., PL/1, BLISS, etc. It is unlikely
that this "clean-up" operation on languages would have occurred as urgently
without the motivation and formalism of PV. However, it may be that proving the
language-independent aspects of algorithms and data structures will achieve most
of the benefits of verification and that language-dependent questions may be
resolved by proving equivalence of programs or by relying on compiler-like
checks.

One of the deepest and most difficult parts of mathematics is mathematical
induction, which allows reasoning about the structure and properties of
potentially infinite objects. Correctness theory started off with the notion of
"invariants," assertions that hold at every iteration of a loop. Another early
inductive form of argument was based on the "structure" of objects [Burstall69]:
assuming a property for its components, the property was proved for an arbitrary
instance of a well-defined object. More recent variations are subgoal induction
[Morris77], which reasons with invariants both forward and backward, and
intermittent assertions [Manna76], which reason from one iteration of a loop to
some future, but not necessarily the next, iteration. The question is whether it
makes any difference which inductive method is chosen and, if so, when. As in
traditional mathematics, it is critically important to get the statement of the
theorem exactly right for an induction argument, after which the argument
usually goes through smoothly.
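
The role of an invariant, and of a well-ordered quantity for the nonlooping question, can be made concrete with a small Python sketch. The function divide and its assertions are a hypothetical example chosen for this illustration; the run-time assert statements only spot-check, on particular inputs, what a proof would establish for all inputs.

```python
def divide(a, b):
    """Quotient and remainder of a by b, by repeated subtraction."""
    assert a >= 0 and b > 0                   # input specification
    q, r = 0, a
    while r >= b:
        assert a == q * b + r and r >= 0      # invariant on entry to each iteration
        variant_before = r
        q, r = q + 1, r - b
        # r stays a non-negative integer and strictly decreases, so the loop
        # cannot run forever (a well-ordered "variant" argument).
        assert 0 <= r < variant_before
    # The invariant together with the exit condition gives the output specification.
    assert a == q * b + r and 0 <= r < b
    return q, r
```

For example, divide(17, 5) returns the pair (3, 2). Getting the invariant a == q*b + r stated exactly is the whole difficulty; once it is stated, each step of the argument is routine.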

Theoretical Problem 3: More insight is needed into the basic nature of inductive
arguments in program proving: when to use which type of argument and how to
formulate the inductive assertions.

Some work has been done on mechanical generation of inductive assertions from
programs and from specifications [Wegbreit74], but the problem is so hard that
these approaches may work only in very simple instances. Another approach is to
accumulate inductive assertions during program derivations as they are made in a
specific context for a specific purpose at a useful level of abstraction, rather than
all at the end.

The  predicate  calculus  is  not  the  most  understandable  language,  either  to 
casual readers or fluent mathematicians.

Theoretical Problem 4: Ways must be found to improve the expressibility and
readability of formalisms, both at the semantic and at the assertion level.

The algebraic approach [Guttag76, Burstall77b] promises some alleviation by
orienting  reasoning  toward  equalities.  The  key  notion  may  be  simply  the  right 
notation for the concepts of interest.

Mathematicians never use a theory at its axiomatic level for very long.
Instead, they develop general and leading theorems which express the important
properties of the concepts of the theory and which suggest new variations and
derivatives of these. Program proving is currently like reading the first chapter of
a mathematics text to find the axioms and basic definitions and then trying to do
the hard exercises at the end of the book. Because the intermediate theory is
missing, each exercise requires building up intermediate concepts and techniques
or brute force applications of axioms.

Theoretical Problem 5: The theory of programs based on correctness properties
must evolve to higher levels of generality in both theorems and techniques.

An active area of theory building, at least at the axiomatic level for data structures,
is [Guttag76], leading possibly to an overall theory of the structure of such
axiomatic theories [Burstall77b]. Further examples of this higher level theory
will be given shortly. While not yet highly visible, this type of theory is the key
to making PV follow the successful route of mathematics, namely pursuit of
stimulating and general statements about the objects (here programs) of interest.

Turning  from  theory  to  those  techniques  that  use  the  theory  for  proving 
programs,  the  basic  question  is  what  theorems  to  prove.  The  current  paradigm  has 
two  disjoint  phases:  (1)  to  completely  transform  assertions  into  predicates 
containing  only  expressions,  not  statements,  from  the  program  and  then  (2)  to 
prove  these  predicates.  This  paradigm,  though  conceptually  simple,  has  numerous 
practical difficulties: the generated predicates may lose all traces of structure from
the  program,  thereby  foiling  attempts  to  find  a proof  of  them  based  on  reasoning 
about  the  program;  recovery  from  an  error,  no  matter  how  minor,  in  either 
program  or  assertions  is  almost  impossible  without  a complete  restart;  and  the 
"calculus  of  programs"  cannot  be  brought  into  play.  This  calculus  consists  of 
derived  properties  of  predicate  transformers,  e.g., 

wlp(S, A ∧ B) = wlp(S,A) ∧ wlp(S,B),

(P => wlp(S,Q)) = (sp(S,P) => Q),

where sp is a "strongest postcondition" transformer.

These  properties  allow  considerably  more  flexibility  in  organizing  proofs  and  are 
perfectly  legal  ways  of  proving  programs,  although  excluded  by  the  two-stage 
paradigm. 
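
Both derived properties can be checked exhaustively on a very small state space. The Python sketch below is an illustration under stated assumptions: predicates are represented as sets of states, the statement S is a hypothetical deterministic, always-terminating update (so wlp and the ordinary weakest precondition coincide), and the names STATES, step, wlp, and sp are chosen for this example only.

```python
from itertools import combinations

STATES = range(4)                  # a tiny state space: x in {0, 1, 2, 3}
step = lambda x: (x * x) % 4       # hypothetical statement S: x := x*x mod 4

def subsets(universe):
    """All subsets of the universe, each representing one predicate."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

PREDICATES = subsets(STATES)

def wlp(q):
    """States from which executing S lands in a state satisfying q."""
    return frozenset(s for s in STATES if step(s) in q)

def sp(p):
    """States reachable by executing S from some state satisfying p."""
    return frozenset(step(s) for s in p)

# wlp(S, A and B) = wlp(S,A) and wlp(S,B); conjunction is set intersection.
conjunctive = all(wlp(a & b) == wlp(a) & wlp(b)
                  for a in PREDICATES for b in PREDICATES)
# (P => wlp(S,Q)) holds in all states exactly when (sp(S,P) => Q) does;
# validity of an implication is set inclusion.
dual = all((p <= wlp(q)) == (sp(p) <= q)
           for p in PREDICATES for q in PREDICATES)
```

A check of this kind proves nothing about programs in general, but it shows what the two identities assert: a proof may be organized forward with sp, backward with wlp, or split across conjuncts, without changing what is being proved.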

It is also possible to prove concrete programs correct by virtue of their being
instances of more abstract programs. The appendix contains an iterative version of
a program schema that is proved to compute a recursive function, stated in terms of
abstract operations F, h0, h1, h2, h3, G, where only the definition of F and the
associativity of G are known. This schema can be instantiated to variations of tree
traversal and perhaps to other functions. One proof at the schema level is used for
two proofs at the concrete level with the further gain that schema level proofs,
free from the distraction of concrete operations, are easier to find and understand.
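
The idea of instantiating a schema can be sketched in Python. The appendix is not reproduced in this section, so the recursion used below, F(x) = h0 if x is nil, else G(h1(x), G(F(h2(x)), F(h3(x)))), is an assumed form chosen for illustration, as are the function make_schema, the treatment of h0 as a constant, and the tuple encoding of trees.

```python
def make_schema(h0, h1, h2, h3, G):
    """Return (recursive F, iterative F) for the assumed schema
    F(x) = h0 if x is None else G(h1(x), G(F(h2(x)), F(h3(x))))."""
    def recursive(x):
        if x is None:
            return h0
        return G(h1(x), G(recursive(h2(x)), recursive(h3(x))))

    def iterative(x):
        stack, acc, started = [x], None, False
        while stack:
            node = stack.pop()
            if node is None:
                item = h0
            else:
                item = h1(node)
                stack.append(h3(node))   # pushed first, so visited after h2(node)
                stack.append(h2(node))
            # Regrouping the nested applications of G into a left-to-right
            # accumulation relies only on the associativity of G.
            acc = G(acc, item) if started else item
            started = True
        return acc

    return recursive, iterative

# Trees are encoded as (value, left, right) tuples, with None for nil.
left = lambda t: t[1]
right = lambda t: t[2]
tree = ("a", ("b", None, ("c", None, None)), ("d", None, None))

# Two instantiations of the same schema: node count and preorder node list.
count_rec, count_iter = make_schema(0, lambda t: 1, left, right, lambda u, v: u + v)
list_rec, list_iter = make_schema([], lambda t: [t[0]], left, right, lambda u, v: u + v)
```

Here count_iter(tree) and list_iter(tree) agree with their recursive counterparts. An argument that the iterative form computes F for every associative G would cover both instantiations at once, which is the economy the schema approach aims for.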

At another level are theorems which allow correctness to be transferred from
one program to another without complete reproof of the second program. A Context
Theorem informally says that the correctness of an entire program is equivalent to
proving the correctness of every statement of the program in a "context" which
consists of "strongest possible preconditions" and "weakest necessary
postconditions" relative to the specifications and assertions in the rest of the
program. This supports a Replacement Theorem which informally says that one
statement S can be replaced by another S' in a program G to give a program G' that
will be correct if G was correct and if S' is correct in the context of S within G. The
correctness of replacements can be generalized and partially proved, leaving just a residue
of proof related to the problem domain. These theorems lead to a paradigm for
proving programs by successive transferral of correctness along a sequence of
replacements; it is illustrated in the appendix. The "correctness-preserving
transformation" approach is actively pursued in, among others, [illegible, Gerhart7?].

Another theorem supports the use of Ghost (or auxiliary) Variables
which have no influence on the result of the program, but which are
used to express histories, used-up values, or missing abstractions. The theorem
says that a program proved correct using these as regular program variables is still
correct if the variables are deleted from the program and appear existentially
quantified in assertions on the proved program. Schemas often use an abstract
data object from which some other value or data object is computed in an
instantiation of the schema, after which the abstraction is removed. In the
appendix example, the tree traversal computes a list of nodes of the tree upon
which a count is made. Only the count is of interest, so the nodelist is deleted.
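
A ghost variable can be imitated in Python as a variable that is written and mentioned in assertions but never read by the computation proper. The sketch below is a hypothetical rendering of the node-counting example; the function count_nodes, its ghost flag, and the tuple encoding of trees are assumptions made for this illustration and are not the appendix program.

```python
def count_nodes(tree, ghost=True):
    """Count the nodes of a tree encoded as (value, left, right) tuples."""
    count = 0
    nodelist = []          # ghost variable: never read by the real computation
    stack = [tree]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        count += 1
        if ghost:
            nodelist.append(node[0])
            # The assertion is phrased in terms of the ghost variable.
            assert count == len(nodelist)
        stack.append(node[2])
        stack.append(node[1])
    return count

tree = ("a", ("b", None, ("c", None, None)), ("d", None, None))
with_ghost = count_nodes(tree, ghost=True)
without_ghost = count_nodes(tree, ghost=False)
```

Because nodelist never feeds into count, deleting every statement that touches it leaves the result unchanged; with_ghost and without_ghost are equal.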

Technique Problem 1: The current two-stage generate-and-prove verification lemmas
approach must be relaxed to admit natural higher level proof techniques and
[illegible] programs. The variant paradigms must be clearly formulated,
formally justified, and subjected to experimentation.

Technique Problem 2: Proving more general theorems reduces overall proof
[illegible] useful theorems and [illegible].

Our  example  illustrates  these  techniques  and  their  problems.  The  data  structure 
abstraction methodology addresses these problems, but the full range of paradigms
and theorem power has not yet been reached.

There are some programming languages which emphasize functions, e.g.,
LISP, but most languages favor iteration as the main form of looping and
sequencing of statements as the main form of composition. Some experience shows
that  it  is  much  easier  to  prove  properties  about  functions  because  they  generate 
modularity,  discourage  unnecessary  sequencing  by  parallel  evaluation  of 
arguments  to  function  calls,  and  direct  attention  toward  the  objects  being 
computed.  There  is  no  loss  in  practice  if  the  proved  functions  can  be  translated  to 
iteration  and  the  proofs  can  be  transferred. 

Technique Problem 3: Which is better, function or iteration representation of
programs? How can they be used interchangeably?

The argument for functions is given by [Manna77, Boyer75], while [Dijkstra76]
argues for iteration.
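
One way to picture the interchange between the two representations is to let the functional definition serve as the specification of its iterative translation. The Python sketch below uses a hypothetical factorial example; the names fact and fact_iter and the particular invariant are assumptions for illustration, and the assert statements only spot-check the invariant at run time.

```python
def fact(n):
    """Functional definition: the object about which properties are proved."""
    return 1 if n == 0 else n * fact(n - 1)

def fact_iter(n):
    """Iterative translation; the invariant ties each loop state back to fact."""
    acc, k = 1, n
    while k > 0:
        assert acc * fact(k) == fact(n)      # invariant before each iteration
        acc, k = acc * k, k - 1
    # On exit k == 0, so the invariant reduces to acc == fact(n).
    assert k == 0 and acc * fact(k) == fact(n)
    return acc
```

For example, fact_iter(5) returns 120. Properties proved about fact carry over to fact_iter once the invariant is established, which is one sense in which a proof can be transferred from the functional form to the iterative one.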

Given  the  fallibility  of  programmers  and  provers,  an  interesting  problem  is 

Technique Problem 4: How much of a proof is still valid after certain types of
modifications to the program or to assertions?

The  only  work  in  this  area  so  far  is  [Moriconi77]. 

Constantly  plaguing  those  who  publish  in  PV  and  try  to  write  proofs  is 

Technique Problem 5: How does one present a proof with sufficient structure
that remaining details can be filled in, but without so much detail that readers
are overwhelmed?

[Wegbreit77] suggests various means of "justifying" proofs as normal program
additions and annotations.

Programs  are  created  by  people  who  should  know  enough  about  them  to 
create proofs. But this is not so when rigorous mathematical standards are applied,
or  even  when  loose  informal  arguments  are  acceptable.  Creating  a proof  requires 
the  ability  to  design  notation  that  captures  the  basic  concepts,  knowledge  of  rules 
of  logic  for  sequencing  steps  of  a proof,  and  an  understanding  of  the  axioms  that 
describe  the  objects  of  interest.  A basic  grounding  in  mathematical  knowledge  is 
inescapable,  but  also  necessary  is  the  creative  component  which  is  not  taught  in 
elementary  math  courses. 

People Problem 1: Potential program provers must not only be taught
mathematical  facts  and  techniques,  but  must  also  be  led  to  develop  their 
manipulative  skills  and  creative  powers. 




[Dijkstra74]  presents  an  enlightening  discussion  of  this  problem. 

Any  new  formalism  or  formal  discipline  is  criticized  as  being  an  unnecessary 
academicians’  toy  for  which  no  practical  use  can  be  foreseen  and  which  is  just  too 
complicated  for  ordinary  people  to  understand.  BNF  was  probably  seen  in  this 
light,  especially  when  followed  by  a flurry  of  papers  about  syntax  and  parsing. 
Yet  it  is  now  widely  accepted  and  taught  without  mystery,  its  use  having  been 
found  and  separated  from  the  formalism  which  refined  it. 

People Problem 2: The rebellion against formalism and the excess of formalism
in  early  stages  must  be  accepted  as  normal. 

It  may  take  a while  for  people  to  accept  the  fact  that  "programming  is  a discipline 
of a mathematical nature" [Dijkstra74].


Programs can be proved without any type of mechanical support, just as
mathematicians  have  proved  theorems  for  hundreds  of  years.  But  program  proving 
differs  in  that  it  requires  many  small  and  not  always  interesting  theorems.  As 
argued  and  illustrated  above,  there  are  metatheorems  which  support  interesting 
proof  techniques  and  general  theorems  which  implicitly  define  interesting  classes 
of programs, but at the concrete level there are always many little problem domain
facts and tricky chains of reasoning needed to glue together a proof. Some
computer assistance is needed both in handling domains where people don't think
well (e.g., integer arithmetic and chains of inequalities) and in making sure every
step of a proof is legitimate. Some initial capability in this area exists in present PV
systems, but there is no overall theory of how to handle either the large number of
disparate  domains  (basically  one  per  program  data  type)  or  the  extremely  large 
search  space  that  can  be  generated  in  finding  proof  steps. 

Technological Problem 1: It appears necessary to find specific domains that are
highly useful in PV and concentrate on increasing potency within them. But an
overall theory of handling these domains is still needed.

Steps in these directions are [Nelson77] with coordinated fast simplifiers,
[Lankford77] on the properties and use of rewrite rule systems, and [Boyer77b] on
the use of lemmas and induction generalization techniques.

As  mentioned  earlier,  the  generate-and-prove  paradigm  is  conceptually  clean 
and  the  generate  phase  is  easily  implemented,  provided  the  language  semantics  are 
given  and  clean.  But  the  paradigm  fails  in  practice,  and  though  newer  ones  are 
known, they are not yet implemented.

Technological Problem 2: PV systems must be extended to permit use of the
"calculus of programs," transferral of correctness between programs, and
hierarchical development of programs.

Theory and implementations of some of these ideas appear in [Good77, Elspas77,
Musser77, Gerhart76].

PV systems are systems in the true sense, containing the semantic analogs of
compilers  in  predicate  transformers,  multiple  theorem  provers  working  in 
cooperation  and  possibly  in  parallel,  input  and  output  routines  to  manage  data  in 
user  terms,  data  bases  of  previously  proved  formulae,  bookkeeping  for  the  status  of 
proofs,  etc.  As  such  they  suffer  the  usual  trauma  of  complexity  and  bear  the 
additional  burden  that  proofs  must  proceed  interactively  without  confusing  or 
boring  the  user.  Neither  humans  nor  systems  can  manage  the  task  of  PV  alone;  the 
effort  must  be  cooperative  and  synergistic. 

Technological Problem 3: PV systems must be made habitable before they can be
experimented with seriously, let alone be put into production use.

Of course, there is the fundamental dilemma: How do you know the verifier is
correct?


Technological Problem 4: PV systems must be constructed so clearly and so well
that their correctness can be accepted after sufficient periods of reliable usage.

Having  looked  at  various  problems  and  steps  regarding  the  theory, 
techniques,  people  and  technology  of  program  proving,  we  will  now  explore  our 
second  approach.  While  not  the  type  of  pure  scientific  analysis  we  might  like  to 
see  in  a paper  on  the  future  of  a scientific  activity,  the  following  historical  and 
psychological  analysis  is  still  important  in  determining  the  course  of  the  field. 
Researchers are seldom as objective as they might like, but, as in everyday life,
find  it  hard  to  identify  all  the  determinants  of  their  actions.  Understanding  the 
trends  in  human  terms  may  make  it  possible  to  break  out  of  ruts  and  make  better 
intermediate  decisions  about  what  problems  to  tackle  next. 

In the development of a theory, there is first a pre-theory stage where some
activity goes on guided only by intuition and common sense until someone
formulates axioms and rules of inference which provide the language and
reasoning mechanisms for discussing the activity and which generate the true
statements of interest. Next some useful and general theorems are proved from
which more specific, interesting theorems can be proved without recourse to the
axioms. Concurrently comes the development of proof techniques that telescope
proof steps or systematize reasoning. Some time later key theorems are recognized,
and proofs begin to be organized around certain similar themes that eventually
become standardized throughout education and research. Finally, some new
problems or insights generate doubts and major revisions.

Related to this progression is the way we view objects and the styles of
manipulation performed upon them. First there is the concrete view and "blind,"
ad hoc manipulation. Then certain patterns evolve which cover most situations
and lead to disciplines, still at the concrete level. With sufficient understanding of
patterns of manipulation and of similarity of structures comes the ability to
generalize from details and thus reach more abstract levels. As abstraction becomes
more widely appreciated, it comes to be used before details are considered, though
indecision, lack of clarity, and old habits of thought make its use inconsistent.
Eventually the emergence of patterns of abstraction leads to disciplines and
ultimately to some standardization.

From  yet  another  viewpoint,  consider  the  way  problems  emerge  and  are  dealt 
with.  Problems  may  be  recognized  informally  for  a long  time  before  the  right 
concepts  and  terms  are  found  to  describe  them.  Then  comes  the  formalization  of 
former  informal  techniques  as  well  as  new  techniques  which  follow  logically 
from the newly recognized concepts. These are explored haphazardly until some
set of critical questions can be posed and some experimental methodology
developed. With experimentation, strengths and weaknesses and areas of
applications are clarified. Somewhere along the line, techniques become useful
enough that dissemination occurs throughout and even outside of the research
community. Eventually, a few standard approaches emerge, but by then new
problems are being recognized relative to old problems, the new solutions, or
externally generated new problems.

The  overall  effect  of  these  progressions  is  considerable  confusion,  tension, 
frustration, and conflict. In the computer field, these are compounded by the
rapidity  of  developments  and  by  economic  and  social  pressures  to  provide  nearly 
instantaneous  solutions  to  barely  understood  and  grossly  underestimated  problems. 
Consider  some  observations  about  the  current  problems  of  PV  in  the  light  of  these 
progressions: 

1. In programming methodology, the move toward abstraction occurred at
almost the same time as the move toward concrete discipline. On the one hand,
people believed that sticking with three fundamental control structures would
solve many problems, while they were being told that abstraction (whatever that
was)  was  the  way  to  manage  complexity.  There  was  the  conflict  between 
discipline  at  one  level  and  ad  hocness  at  another.  PV  is  at  a similar  transition  stage: 
some  gain  has  been  made  at  the  concrete  level,  while  abstraction  promises  much 
more, at the temporary sacrifice of familiarity and some acquired discipline.

2. When trying to apply a new theory which exists only at the axiom level,
the practicing prover soon notices that many mathematical properties and
techniques get rediscovered or redone and that starting from scratch on every
problem  is  excruciatingly  slow  and  painful,  as  well  as  stupid.  But,  if  provers  start 
consciously  developing  the  theory  in  a systematic  fashion,  the  amount  of  notation, 
theorems,  and  techniques  soon  becomes  disorganized  and  unmanageable  (witness 
the  size  and  complexity  of  mathematics  books).  Furthermore,  there  are  still  so 
many  programs  to  be  proved  that  each  new  one  can  find  very  little  well  organized 
old  material  to  lean  on.  Our  earlier  analogy  with  proving  the  hard  exercises  at  the 
end  of  the  book  given  only  the  first  chapter  encapsulates  the  dilemma  of  PV,  which 
only  time  and  concentrated  effort  can  cure.  Notice  that  the  above-mentioned 
theorems are really formalizations of common informal reasoning.

3. Related to the previous observation is doubt about the ability to scale up
from  the  small  and  simple  problems  used  to  develop  theory  and  techniques  to  the 
messy  real  problems.  At  some  point,  a great  deal  of  effort  must  be  expended  in  just 
this  scaling  up,  forcing  theory  development  and  introducing  new  problems  of  size 
and  interaction  complexity. 

4. Continually disturbing is the "oversell/overbuy" phenomenon. The new
solutions  look  more  promising  than  the  old  ones  before  there  is  adequate 
recognition  of  the  new  problems  that  will  accompany  the  new  solutions.  PV  has 
been  accepted  by  large  portions  of  the  research  world  as  a major  theme  for  viewing 
and  attacking  all  kinds  of  problems.  At  least  some  significant  portion  of  the 
practicing  programming  world  recognizes  its  existence  and  uses  it  in  some  diluted 
form or is affected by such byproducts as newer and cleaner languages. But it is
difficult  to  measure  the  value  of  these  byproducts  and  attitude  changes  relative  to 
the  ultimate  (but  as  yet  unachieved)  goal. 

5. It takes a long time to develop adequate experimental methods to evaluate
various  proposed  solutions  or  paradigms,  not  to  mention  the  time  it  takes  to 
perform  the  experiments.  Consequently,  the  evaluations  are  hasty,  unquantified, 
and  subjective.  Therefore,  there  is  always  further  doubt  about  validity  and 
usefulness  which  never  has  time  to  be  either  objectively  dispelled  or  supported. 
PV,  as  well  as  many  other  parts  of  computer  science,  lacks  the  experimental 
methodology  for  conscientiously  evaluating  its  competing  methods  and  validating 
its  claims. 

6. Technology for assisting manual techniques always lags far behind
because  methods  must  be  shown  to  have  some  theoretical  validity  and  be  somewhat 
formalized  before  they  can  be  incorporated  into  languages  which  can  then  be 
meshed  with  previous  technology.  Furthermore,  it  simply  takes  a long  time  to 
develop comfortable and reliable technological support. This also goes along with
the bottleneck of experimentation: in order to do any large scale experimentation,
technology  is  necessary,  but  it  is  difficult  to  know  how  to  direct  the  technology 
before  usefulness  has  been  demonstrated.  In  PV,  the  gap  between  methodology  and 
technology  must  be  filled  by  languages,  although  some  new  paradigms  may  be 
explored  at  the  concrete  level  and  with  conventions  for  current  languages. 

7. One further complication is that as soon as some solution looks promising,
its  earliest  version  is  disseminated.  Of  course,  this  early  version  has  difficulties 
which  cause  frustrations  and  doubt;  even  if  it  gets  mastered,  ingrained  habits  and 
inertia  may  cause  improved  methods  to  be  rejected  as  they  come  along.  The 
invariant  assertion  method  associated  with  full  verification  condition  generation 
on  concrete  programs  has  now  been  widely  written  up  and  is  frequently  taught. 
While  this  conveys  some  of  the  fundamental  ideas  of  PV,  the  inflexibility  and 
distasteful  separation  from  program  construction  make  it  unduly  difficult  in 
practice  and  misrepresent  the  difficulties  of  PV. 

8. Since everyone learned programming early in her (or his) computer
science career and usually got along well enough for several years with that
amount of knowledge, there is reluctance to accept the unpleasant fact that there is
always so much more to learn, especially when it means breaking years of habit.
PV confronts programmers with their lack of understanding of programs and their
inability  to  express  what  they  do  understand.  Sometimes  the  challenge  is  accepted 
and  proving  is  mastered,  but  sometimes  the  frustration  at  not  understanding  gets 
displaced  to  proving  rather  than  programming. 

These  eight  points  all  relate  to  the  compression  of  many  stages  of  research  and 
development  into  a very  short  period  of  time  and  to  our  basic  human  tendencies  to 
expect  too  much  too  soon  for  too  little  effort.  While  not  "solvable"  in  any  sense, 
they  can  explain  problems  which  must  be  endured  without  distraction  from  the 
ultimate  goals. 

3.  NEEDED  ADVANCES  AND  BREAKTHROUGHS 

Breakthrough 1: A theory which unifies testing and proving or selects between
them.

Since  ultimately  verification  will  consist  of  some  combination  of  testing  and 
proving,  it  would  greatly  help  both  camps  to  have  better  perspective.  The 
breakthrough  will  probably  have  to  come  from  the  testing  end  of  verification, 
because  program  proving  researchers  see  their  approach,  which  is  based  on  a 
rapidly  developing  mathematical  theory,  as  more  promising.  The  considerable 
amount  of  research  on  sampling  and  probabilistic  testing  has  neither  convinced  nor 
interested the program proving community. The cost of efforts in this direction
could be small, since the breakthrough would most likely come from some
ingenious twist on current research to yield the critical insight. Some unifying
steps are [Goodenough77, Howden77], which emphasize errors and their relation to
both testing and proving. The important factor is that researchers be receptive to,
rather than at war with (or, even worse, ignoring) the other camp.

Breakthrough 2: A significant increase in the power of mechanical theorem
provers.

Technological  improvements  in  speed  and  capacity  of  computers  will  help,  but  the 
real  need  is  for  a theory  that  unifies  various  strategies  in  such  a way  that  small  but 
important  domains  can  be  well  handled.  This  is  one  area  where  optimistic 
projections  for  full  mastery  of  the  type  of  theorem  proving  we  want  today,  namely 
interactive  guidance  by  user-supplied  strategies  through  fully  mechanized 
subproofs,  are  ten  years,  with  full  capabilities  for  finding  proofs,  although  not 
necessarily  finding  interesting  theorems,  in  thirty  years.  There  are  really  only  two 
schools of present theorem proving: the resolutionists, who haven't yet seriously
considered PV applications, and the nonresolutionists [Bledsoe74], who have made
a major, but ad hoc, attack on PV-related problems. Considerably more research
funding could go to this area, but there are not yet many researchers capable of, or
interested  in,  attacking  such  a hard,  long-term  problem.  It  requires  a unique 
combination  of  mathematical  and  programming  depth  of  knowledge  and 
experience.  A clean  theory  of  mechanical  theorem  proving  will  require  good 
implementation  to  be  effective,  while  good  programming  skills  must  be  backed  up 
by  sound  theory  and  deep  insight.  However,  mechanical  theorem  proving  is  not 
necessary  to  achieve  many  of  the  benefits  of  program  proving;  it  is  necessary  only 
for  the  highest  attainable  degree  of  certainty. 

Breakthrough 3: More than one large-scale demonstration of PV.

Even as a combination of manual and mechanical proofs, such demonstrations
would  command  attention  and  give  momentum  to  PV.  Given  the  current  state  of 
technology,  skills,  and  basic  knowledge,  this  seems  possible  with  sufficient 
dedication  of  resources.  The  number  of  snags  will  probably  be  large,  but  these  will 
suggest  many  new  and  interesting  research  and  development  tasks.  A failure  can 
be  attributed  either  to  basic  flaws  in  the  approach  or  to  underestimation  and 
undercommitment  of  resources,  either  of  which  would  still  provide  important 
impetus  in  the  direction  of  verification  efforts.  The  costs  are  the  dedication  of 
sufficient  human  and  computer  time  and  energy,  perhaps  at  the  expense  of  other 
theoretical  and  practical  developments. 

Breakthrough 4: Development of sound experimental methodology for evaluating
various PV paradigms.

Such methodology could clarify and speed up the evaluation of vague paradigms.
This  might  be  nothing  more  than  the  amount  of  real  and  computer  time  required  to 
push  through  a variety  of  examples  on  an  existing  system,  together  with  the 
quantity  of  transfused  knowledge  and  strategic  direction  provided  by  users. 
Unfortunately,  current  systems  are  not  powerful  enough  to  handle  multiple 
paradigms, and cross-comparisons between systems would be difficult.
Nevertheless, even if the experiments cannot be performed for lack of apparatus,
the  formulation  of  the  experiments  and  their  limited  manual  application  could  be 
valuable.  The  cost  of  such  an  effort  would  primarily  be  associated  with  some 
coordinating  body  which  selected  paradigms,  designed  experiments,  and  evaluated 
results  from  several  projects. 

Breakthrough 5: The accumulation of sufficient theoretical results to reach a
critical mass where new program proofs can reuse significant portions of
previous proofs.

Program  proofs  which  start  from  scratch,  as  currently,  will  be  prohibitively 
expensive,  if  not  completely  unmanageable.  The  main  problem  is  translating 
current informal knowledge about programs into theoretical terms and organizing
this  mass  of  knowledge  so  that  it  may  be  studied  and  mechanically  accessed.  This 
should  accompany  the  normal  growth  of  PV,  but  it  may  be  possible  sometime  in  the 
next  decade  to  seriously  concentrate  on  this  problem.  Our  confidence  in  PV  arises 
from  this  combination  of  widely  accepted  and  used  higher  level  theorems  with 
mechanized  lower  level  proof  checking,  mediated  by  human  creativity  in 
organizing  proofs. 

Breakthrough 6: The management of complexity of PV systems and the design
for synergistic human-machine interaction.

The complexity problem exists for current systems which are nowhere near their
end  goals.  One  system  which  works  extremely  well  for  users  other  than  its 
designers  will  show  the  direction  for  other  PV  systems.  The  sustained  support  of 
present  PV  projects,  which  are  well  aware  of  this  problem  and  headed  in  this 
direction,  should  be  sufficient,  especially  when  more  users  gain  access  to  the 
systems. 

4. EFFECTS OF PV SOLUTIONS/BREAKTHROUGHS
ON COMPUTING IN THE 1980s

The  effects  of  PV  must  be  separated  into  the  tangibles  and  intangibles. 

(Intangible) Effect 1: The education of programmers can be vastly improved and
accelerated.

Those of us associated with education constantly see the improving quantity and
quality of material taught in courses. For example, basic data structure material
that was unknown, in a systematic way, by Ph.D. graduates of the early 1970's is
now taught routinely to college freshmen. The impact of PV may be to sort out this
basic programming material, organize it more systematically, and present it more
coherently. Programs with invariant assertions are no more, and probably less,
mysterious than unasserted, and often unspecified, programs. It is likely that the
textbooks of the 1980s will routinely use program proving ideas without any
special fuss about verification. Then more people will be able to read and perform
proofs, thereby increasing the validity of PV. In light of the productivity
variations between programmers, simply teaching more programmers more
material should improve the overall quality of programming.

(Intangible) Effect 2: The standards of quality in programs and programmers
should  improve. 

There  can  be  little  doubt  that  software  quality  has  been  low  both  because  little 
more  was  expected  and  because  low  quality  was  acceptable.  PV  makes  very  high 
demands  on  the  quality  of  programs  and  reveals  deficiencies.  If  at  least  some 
"perfect"  programs,  in  every  sense  of  their  quality,  can  be  produced  and  widely 
disseminated,  then  perhaps  the  quest  for  perfection  will  be  more  broadly  sought, 
especially  if  perfection  turns  out  to  pay  off.  An  example  in  this  direction  is  the 
Unix  operating  system,  which  has  gained  widespread  use  simply  because  it  is  clean, 
comfortable,  and  reliable,  if  not  all-purpose  and  fancy. 

(Tangible) Effect 3: The construction and maintenance of programs will rely
greatly on formal reasoning, although not always formal proofs.

Program  proving  as  pure  verification  separated  from  construction  will  probably 
disappear.  This  may  leave  testing  as  the  primary  mode  of  verification,  both  as 
confirmation  and  exhibition  of  program  quality.  If  maintenance — that  is,  the 
fixing  of  deficiencies  as  they  are  recognized  plus  the  adjustment  of  function  to 
meet  new  demands — is  as  expensive  as  figures  seem  to  indicate,  then  PV  may  pay 
off  most  here.  Even  if  proofs  are  not  performed  or  are  not  reliable  in  the  sense  of 
verification,  they  demand  that  programs  be  fully  specified  and  fully  documented 
in  the  form  of  assertions.  Such  assertions  may  guide  a new  form  of  maintenance, 
i.e.,  systematic  modification  preserving  correctness  as  stated  in  assertions.  As 
pointed out in [Balzer76], it may be feasible to shift maintenance from the level of
concrete code up to appropriate abstract levels, reimplementing when modification
is  necessary. 

(Tangible) Effect 4: Significant sized programs will be proved, albeit at
considerable expense.

We will probably see some programs of hundreds of lines of code selected from real
applications being proved before 1980. For programs intended for wide
distribution  and  critical  applications,  the  expense  will  be  fully  justifiable  and  will 
have  to  be  borne.  The  challenge  will  be  to  reduce  the  expense  for  full-scale  proofs 
and  to  appropriately  compromise  expense  with  uncertainty  for  scaled-down  proofs. 

(Tangible) Effect 5: Standardized components will be built and verified and
used.

The  long-term  goal  of  software  engineering  includes  the  development  of 
off-the-shelf  components  for  many  common  tasks.  This  has  required  specification 
techniques  so  that  components  can  be  selected  and  composed,  verification 
techniques  so  that  the  components  can  be  trusted,  theory  and  experience  to  show 
what  should  be  standardized,  and  adaptation  techniques  so  that  components  can  be 
used  in  many  ways  in  many  environments.  Theory  and  practice  seem  to  be 
reaching  the  point  where  this  goal  can  be  a reality. 

Of  course,  the  ultimate  effect  may  be  chaos  or  destruction.  That  we  can 
verify software to a high degree of certainty does not mean it will fit well into
human social and economic systems. One cannot help but feel the accelerating use
of computing systems: electronic fund transfer, maintenance of power facilities,
monitoring  of  real  time  systems,  electronic  mail,  speech  communication,  home 
computers  for  everyone's  daily  activities,  hospital  patient  monitoring  systems,  data 
bases,  etc.  While  computer  scientists  cannot  solve  the  issues  involved  with  the 
disparate  uses  of  computers,  perhaps  we  should  try  to  integrate  our  painful 
experience  with  fallibility  with  the  social  and  economic  systems  that  will  use  our 
results. 


Appendix 

Example: Variations of Binary Tree Traversal


Purpose: This example is intended to show an overall generality and variety of
techniques  greater  than  commonly  seen  in  the  literature  on  program  verification. 
It  emphasizes  aspects  of  the  correctness-based  theory  of  programs  described  in  the 
paper,  specifically  several  types  of  theorems: 

(1)  Schemas  (partially  interpreted  programs)  for  an  iterative  version  of  a 
special  recursive  function  and  the  interpretation  of  this  schema  to  the 
more  specific,  but  still  generally  described,  task  of  preorder  tree  traversal. 

For the purpose of comparison with previous publications [London77a,
Burstall74], the trees will be traversed to count the "tips" and the "leaves."

(2) Several transformations which allow alteration of conditional statement
and loop structure.

(3) A "ghost variable" theorem which allows deletion of a variable used in the
schema  once  it  has  been  related  to  specific  reasons  for  performing  the 
traversal. 

Associated  with  these  theorems  are  natural  methods  for  proving  programs: 

(1) By instantiation of proved schemas to concrete programs

(2) By transformation to transfer correctness, with little additional proof
effort, from one form to another one more desirable for non-correctness
reasons, say optimization or implementation within a restricted set of
language constructs.

These higher level methods all rely on the old, familiar invariant assertion
method, but partially shift its use to the schema and transformation level. In
addition, by following this paradigm of instantiational and transformational proof,
we are able to conjecture some possible "laws" which govern the forms of
assertions.

Version 1: Schema for iterative version of a recursive function

Let

    F(x) = if p(x) then h0(x) else G(G(h1(x), F(h2(x))), F(h3(x)))

where G is an associative operation with identity 1G.

For ease of reading, we will write G as an infix operator, symbolized by ®:

    F(x) = if p(x) then h0(x) else h1(x) ® F(h2(x)) ® F(h3(x))

An  iterative  program  for  this  function  is 

    declare S: Stack (with the usual operations
                      PushStack, PopStack, TopStack, CreateStack);
            x: type D1 with operations h2, h3 producing results of type D1
               and operations h1, F producing results of type D2
               and ® an operation on type D2 producing a D2;
    Acc := 1G; x := x'; S := CreateStack;
    loop as F(x') = Acc ® F(x) ® UnravelStack(S)
    while ~p(x) or S≠CreateStack do
        if ~p(x) then
            Acc := Acc ® h1(x);
            S := PushStack(S, h3(x)); x := h2(x)
        else
            Acc := Acc ® h0(x);
            x := TopStack(S); S := PopStack(S)
        fi
    repeat;
    Acc := Acc ® h0(x);
    assert F(x') = Acc

where

    UnravelStack(S) =
        if S=CreateStack then 1G else F(TopStack(S)) ® UnravelStack(PopStack(S))

The proof rule for the loop is

    P ⊃ A,   A∧~B ⊃ Q,   A∧B {S} A
    -------------------------------------
    P {loop as A while B do S repeat} Q


The  proof  of  this  program  is  relatively  easy  using  the  standard  inductive  assertion 
method.  Note  that  the  only  reasoning  necessary  or  available  for  the  proof  is  logic, 
the associativity property of ®, and "stack algebra" [Guttag76].
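The schema can be exercised on small inputs as a sanity check before attempting the proof. The following Python sketch is not part of the original report: the function names `recursive_f` and `iterative_f` and the `check_invariant` flag are illustrative assumptions, a Python list stands in for the abstract stack, and the operator ® is passed in as `op` with its identity 1G as `identity`. With the flag set, the loop assertion F(x') = Acc ® F(x) ® UnravelStack(S) is evaluated on every iteration by calling the recursive definition.

```python
def recursive_f(x, p, h0, h1, h2, h3, op):
    """Recursive definition: F(x) = h0(x) if p(x), else h1(x) op F(h2(x)) op F(h3(x))."""
    if p(x):
        return h0(x)
    left = recursive_f(h2(x), p, h0, h1, h2, h3, op)
    right = recursive_f(h3(x), p, h0, h1, h2, h3, op)
    return op(op(h1(x), left), right)


def iterative_f(x0, p, h0, h1, h2, h3, op, identity, check_invariant=False):
    """Iterative schema using an explicit stack (a Python list; top is the last item)."""

    def f(x):
        return recursive_f(x, p, h0, h1, h2, h3, op)

    def unravel(stack):
        # UnravelStack: combine F over the stack contents from top to bottom.
        result = identity
        for item in reversed(stack):
            result = op(result, f(item))
        return result

    acc, x, stack = identity, x0, []
    while True:
        if check_invariant:
            # Loop assertion: F(x') = Acc op F(x) op UnravelStack(S)
            assert f(x0) == op(op(acc, f(x)), unravel(stack))
        if p(x) and not stack:
            break  # loop condition "~p(x) or S != CreateStack" is false
        if not p(x):
            acc = op(acc, h1(x))
            stack.append(h3(x))
            x = h2(x)
        else:
            acc = op(acc, h0(x))
            x = stack.pop()
    return op(acc, h0(x))
```

String concatenation is a convenient choice for `op` in experiments because it is associative but not commutative, so an ordering mistake in the iterative version shows up immediately.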

Version 2: Instantiation of the schema to tree traversal


Now, assume type D1 is a binary tree of the usual form, either nil or containing left
and right subtrees, denoted Left(t) and Right(t), which are also trees. Let the type
D2 be sequences, denoted <...>, with catenation denoted ▷. In the above schema, F
can be instantiated to give a list of subtrees, either with or without nils, by
respectively

    Nodes(t) = if t=nil then <> else <t> ▷ Nodes(Left(t)) ▷ Nodes(Right(t))

or

    Subtrees(t) = if t=nil then <nil> else <t> ▷ Subtrees(Left(t)) ▷ Subtrees(Right(t))

Here ® is ▷, the catenation operator between sequences;
h0(t) is <>, the empty sequence, for Nodes and <nil> for Subtrees;
h1(t) is <t>, h2(t) is Left(t), h3(t) is Right(t) for both Nodes and Subtrees.

Therefore the following program computes Nodes:
    NodeList := <>; t := T; S := CreateStack;
    loop as Nodes(T) = NodeList ▷ Nodes(t) ▷ UnravelStack(S)
    while t≠nil or S≠CreateStack do
(**)    comment statement to be added here;
        if t≠nil then
            NodeList := NodeList ▷ <t>;
            S := PushStack(S, Right(t)); t := Left(t)
        else
(*)         NodeList := NodeList ▷ <>;
            t := TopStack(S); S := PopStack(S)
        fi
    repeat;
(*) NodeList := NodeList ▷ <>;
    assert Nodes(T) = NodeList;

The program for Subtrees differs only in the lines (*), which become NodeList := NodeList ▷ <nil>.
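To make the instantiation concrete, here is a small Python sketch, offered as an illustration rather than as the report's own code. The `Tree` class and the names `nodes`, `subtrees`, and `traverse` are assumptions introduced for this example; `None` plays the role of nil, and Python lists play the role of sequences. The parameter `nil_marker` is the only point where the Nodes and Subtrees programs differ, corresponding to the lines marked (*).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tree:
    label: str
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None


def nodes(t):
    """Recursive specification: preorder list of the non-nil subtrees."""
    if t is None:
        return []
    return [t] + nodes(t.left) + nodes(t.right)


def subtrees(t):
    """Recursive specification: preorder list of subtrees, nils included."""
    if t is None:
        return [None]
    return [t] + subtrees(t.left) + subtrees(t.right)


def traverse(root, nil_marker):
    """Iterative traversal; nil_marker is [] for Nodes and [None] for Subtrees."""
    node_list, t, stack = [], root, []
    while t is not None or stack:
        if t is not None:
            node_list = node_list + [t]
            stack.append(t.right)      # h3(t) = Right(t)
            t = t.left                 # h2(t) = Left(t)
        else:
            node_list = node_list + nil_marker   # line (*) inside the loop
            t = stack.pop()
    return node_list + nil_marker                # line (*) after the loop
```

For a tree `T`, `traverse(T, [])` is expected to agree with `nodes(T)` and `traverse(T, [None])` with `subtrees(T)`.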

Version 3: Inclusion of counting operations during traversal

Now  suppose  we  want  to  traverse  the  tree  in  order  to  count  something 
about  its  subtrees,  say 
    leaf(t) = (t≠nil) ∧ Left(t)=nil ∧ Right(t)=nil

or

    tip(t) = (t=nil)

We can do so by inserting C:=0 before the loop and making line (**), respectively,

    if t≠nil ∧ Left(t)=nil ∧ Right(t)=nil then C:=C+1 fi    in the Nodes program
    if t=nil then C:=C+1 fi                                 in the Subtrees program

and proving the additional loop assertion

    C = Count(NodeList)

where

    Count(NodeList) = if NodeList=<> then 0
        else (if q(Last(NodeList)) then 1 else 0) + Count(OtherThanLast(NodeList))

with  q respectively  leaf  and  tip.  For  q as  leaf,  this  gives 
    NodeList := <>; t := T; S := CreateStack;
    C := 0;
    loop as Nodes(T) = NodeList ▷ Nodes(t) ▷ UnravelStack(S)
            ∧ C = Count(NodeList)
    while t≠nil or S≠CreateStack do
        if t≠nil ∧ Left(t)=nil ∧ Right(t)=nil then C := C+1 fi;
        if t≠nil then
            NodeList := NodeList ▷ <t>;
            S := PushStack(S, Right(t)); t := Left(t)
        else
            NodeList := NodeList ▷ <>;
            t := TopStack(S); S := PopStack(S)
        fi
    repeat;
    NodeList := NodeList ▷ <>;
    assert Nodes(T) = NodeList ∧ C = Count(NodeList)
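The leaf-counting program can be tried out in Python with the loop assertion checked at run time. This is a minimal sketch under stated assumptions: `Tree`, `is_leaf`, `count`, and `count_leaves` are names invented for the illustration, `None` stands for nil, and `count` is written with a generator expression rather than the recursive Count on the last element, which yields the same number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tree:
    label: str
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None


def is_leaf(t):
    """leaf(t): t is non-nil and both of its subtrees are nil."""
    return t is not None and t.left is None and t.right is None


def count(node_list, q):
    """Count(NodeList): number of entries of node_list that satisfy q."""
    return sum(1 for n in node_list if q(n))


def count_leaves(root):
    """Preorder traversal that counts leaves while building NodeList."""
    node_list, t, stack, c = [], root, [], 0
    while t is not None or stack:
        # Additional loop assertion: C = Count(NodeList)
        assert c == count(node_list, is_leaf)
        if is_leaf(t):                 # the statement inserted at line (**)
            c += 1
        if t is not None:
            node_list.append(t)
            stack.append(t.right)
            t = t.left
        else:
            t = stack.pop()
    assert c == count(node_list, is_leaf)
    return c, node_list
```

The assertion sits at the head of the loop, so it is checked before the counting statement and the append of the same iteration, which is exactly where the loop assertion is required to hold.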

Version  4:  Optimization  of  the  leaf-counting  version 

Now we will go through a long string of optimizing transformations.
First, we move the counting operation inside the if-then-else.


Part of program to be replaced

    P{ if B1 ∧ B2 then S1 fi;
       if B1 then S2 else S3 fi }Q
==>

Replacing part

    P{ if B1 then
           if B2 then S1 fi;
           S2
       else S3 fi
    }Q

Sufficient conditions for correctness preservation
    when P ∧ B1 ∧ B2 {S1} B1, that is, S1 does not change B1

This applies because t≠nil is preserved over C:=C+1.
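The side condition on this transformation can be illustrated by brute force over the truth values of the two tests. In the Python sketch below, the State record and every function name are assumptions introduced for illustration: S2 and S3 merely log that they ran, and two candidate bodies for S1 are compared, one that leaves B1 alone (like C:=C+1) and one that falsifies it.

```python
from itertools import product
from typing import NamedTuple, Tuple


class State(NamedTuple):
    b1: bool              # value of the test B1 in this state
    b2: bool              # value of the test B2 in this state
    c: int                # a counter, as in C := C + 1
    log: Tuple[str, ...]  # records which of S2, S3 was executed


def s2(s):
    return s._replace(log=s.log + ("S2",))


def s3(s):
    return s._replace(log=s.log + ("S3",))


def original(s, s1):
    # if B1 and B2 then S1 fi; if B1 then S2 else S3 fi
    if s.b1 and s.b2:
        s = s1(s)
    return s2(s) if s.b1 else s3(s)


def transformed(s, s1):
    # if B1 then (if B2 then S1 fi; S2) else S3 fi
    if s.b1:
        if s.b2:
            s = s1(s)
        return s2(s)
    return s3(s)


def agree(s1):
    """True when both forms yield the same final state for every B1, B2."""
    return all(
        original(State(b1, b2, 0, ()), s1) == transformed(State(b1, b2, 0, ()), s1)
        for b1, b2 in product([False, True], repeat=2)
    )


def s1_preserving(s):
    return s._replace(c=s.c + 1)  # does not touch B1


def s1_breaking(s):
    return s._replace(b1=False, c=s.c + 1)  # changes B1, violating the condition


preserving_agrees = agree(s1_preserving)
breaking_agrees = agree(s1_breaking)
```

With the B1-preserving body the two forms agree on all four cases. With the B1-falsifying body they differ when both tests hold: the original form then runs S3, while the transformed form runs S2.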

Next, observe that, since stacks are usually finitely implemented and therefore
subject to overflow, we might want to avoid putting a nil Right(t) on the stack. So we
introduce the statement if Right(t)≠nil then S:=PushStack(S,Right(t)) fi; using the
transformation

    P{S1}Q ==> P{if B then S1 fi}Q
    when P ∧ ¬B ⊃ Q


In this transformation, the strongest precondition for S1, P, is
    Nodes(T) = NodeList' ⊕ Nodes(t) ⊕ UnravelStack(S)
    ∧ C'=Count(NodeList') ∧ (t≠nil or S≠CreateStack) ∧ (t≠nil)
    ∧ (C = if Left(t)=nil ∧ Right(t)=nil then C'+1 else C')
    ∧ NodeList = NodeList' ⊕ <t>

¬B is Right(t)=nil, and Q, the weakest necessary postcondition, is
    Nodes(T) = NodeList ⊕ Nodes(Left(t)) ⊕ UnravelStack(S)
    ∧ C=Count(NodeList)
The condition P ∧ ¬B ⊃ Q holds because Nodes(t) = <t> ⊕ Nodes(Left(t)) ⊕ Nodes(Right(t))
and Nodes(Right(t))=<>. This transformation allows us to prove the additional
invariant


    NoNilNodesOnStack(S) = ( S=CreateStack or
        TopStack(S)≠nil ∧ NoNilNodesOnStack(PopStack(S)) )

giving the result of all these transformations as
NodeList := <>; t := T; S := CreateStack;
C := 0;
loop as Nodes(T) = NodeList ⊕ Nodes(t) ⊕ UnravelStack(S)
        ∧ C=Count(NodeList)
        ∧ NoNilNodesOnStack(S)
while t≠nil or S≠CreateStack do
    if t≠nil then
        if Left(t)=nil ∧ Right(t)=nil then C:=C+1 fi;
        NodeList := NodeList ⊕ <t>;
        if Right(t)≠nil then S:=PushStack(S,Right(t)) fi;
        t := Left(t)
    else
        NodeList := NodeList ⊕ <>;
        t := TopStack(S); S := PopStack(S)
    fi
repeat
NodeList := NodeList ⊕ <>;
assert Nodes(T)=NodeList ∧ C=Count(NodeList)
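The effect of the conditional push can be observed in a Python transcription that evaluates all three clauses of the loop assertion and also records the largest stack depth reached. As before, this is a sketch under assumptions: the class Node, the function names, the max_depth bookkeeping, and the reading of UnravelStack(S) as the concatenation of Nodes over the stacked subtrees, top first, are not taken from the report.

```python
class Node:
    """Hypothetical binary tree node; None plays the role of nil."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def nodes(t):
    # Preorder list of the node objects of t; nil gives the empty list.
    return [] if t is None else [t] + nodes(t.left) + nodes(t.right)


def unravel_stack(stack):
    # Assumed meaning of UnravelStack: Nodes of each stacked subtree, top first.
    out = []
    for sub in reversed(stack):
        out += nodes(sub)
    return out


def count(node_list):
    # Count with q taken to be leaf.
    return sum(1 for n in node_list if n.left is None and n.right is None)


def count_leaves(root):
    node_list, t, stack, c = [], root, [], 0
    max_depth = 0  # illustration only: largest stack size observed
    while t is not None or stack:
        assert nodes(root) == node_list + nodes(t) + unravel_stack(stack)
        assert c == count(node_list)
        assert all(s is not None for s in stack)  # NoNilNodesOnStack(S)
        if t is not None:
            if t.left is None and t.right is None:
                c += 1
            node_list.append(t)
            if t.right is not None:  # a nil Right(t) is no longer pushed
                stack.append(t.right)
            t = t.left
        else:
            t = stack.pop()
        max_depth = max(max_depth, len(stack))
    assert nodes(root) == node_list and c == count(node_list)
    return c, max_depth


left_chain = Node(Node(Node()))  # no right children anywhere
chain_result = count_leaves(left_chain)
mixed_result = count_leaves(Node(Node(Node()), Node()))
```

On a tree with no right children the stack is never used at all, whereas the earlier version would push one nil per node visited.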

Of course, NodeList is unnecessary in this program, so it can be deleted using
the Ghost Variable Theorem [Gerhart78]:

t := T; S := CreateStack;
C := 0;
loop as ∃NodeList: ( Nodes(T) = NodeList ⊕ Nodes(t) ⊕ UnravelStack(S)
        ∧ C=Count(NodeList) ∧ NoNilNodesOnStack(S) )
while t≠nil or S≠CreateStack do
    if t≠nil then
        if Left(t)=nil ∧ Right(t)=nil then C:=C+1 fi;
        if Right(t)≠nil then S:=PushStack(S,Right(t)) fi;
        t := Left(t)
    else
        t := TopStack(S); S := PopStack(S)
    fi
repeat
assert ∃NodeList: (Nodes(T)=NodeList ∧ C=Count(NodeList))

After proving the distributivity of Count over ⊕, i.e.,
Count(a⊕b)=Count(a)+Count(b), the assertions may be reworked to
    Count(Nodes(T)) = C + Count(Nodes(t)) + Count(UnravelStack(S))
and
    C=Count(Nodes(T))
and the ghost variables are all gone. Finally, there are many redundant tests
within the program, such as finding Left(t)=nil at the test before C:=C+1, making t
be Left(t), then finding t=nil in the next loop traversal, and PopStacking Right(old t)
to become the new t, which can be shortened to t:=Right(t). Removing these
redundant tests by twisting around the paths of the program, proving that the
verification conditions for the new paths follow from those for the paths of the
previous programs, leaves the more efficient, but uglier,
t:=T; S:=CreateStack; C:=0;
if t≠nil then
    L: Count(Nodes(T)) = C + Count(Nodes(t)) + Count(UnravelStack(S))
       ∧ t≠nil ∧ NoNilNodesOnStack(S);
    if Left(t)=nil then
        if Right(t)=nil then
            C:=C+1;
            if S=CreateStack then goto Finish
            else t:=TopStack(S); S:=PopStack(S); goto L fi
        else t:=Right(t); goto L fi
    else
        if Right(t)≠nil then S:=PushStack(S,Right(t)); t:=Left(t); goto L
        else t:=Left(t); goto L fi
    fi
fi;
Finish: assert Count(Nodes(T))=C;
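Since Python has no goto, the final program can only be approximated: in the sketch below each pass through the while loop stands for an arrival at label L, and the return statement stands for goto Finish. The sketch pushes Right(t), as the earlier versions do, and compares the result with a direct recursive leaf count on every tree shape of up to six nodes. The class Node, the generator all_trees, and the function names are hypothetical helpers introduced for this comparison.

```python
class Node:
    """Hypothetical binary tree node; None plays the role of nil."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def count_leaves_recursive(t):
    # Direct specification: the number of leaves in t.
    if t is None:
        return 0
    if t.left is None and t.right is None:
        return 1
    return count_leaves_recursive(t.left) + count_leaves_recursive(t.right)


def count_leaves_optimized(root):
    t, stack, c = root, [], 0
    if t is None:
        return c
    while True:  # each iteration corresponds to arriving at label L with t not nil
        if t.left is None:
            if t.right is None:
                c += 1
                if not stack:
                    return c  # goto Finish
                t = stack.pop()
            else:
                t = t.right  # shortened path: no push followed by an immediate pop
        else:
            if t.right is not None:
                stack.append(t.right)
            t = t.left


def all_trees(n):
    """All binary tree shapes with exactly n nodes (subtrees are shared, never mutated)."""
    if n == 0:
        return [None]
    shapes = []
    for k in range(n):
        for left in all_trees(k):
            for right in all_trees(n - 1 - k):
                shapes.append(Node(left, right))
    return shapes


all_agree = all(
    count_leaves_optimized(t) == count_leaves_recursive(t)
    for n in range(7)
    for t in all_trees(n)
)
```

Agreement on small trees is evidence rather than proof; the derivation in the text is what carries correctness from the earlier versions to this one.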

Notice  some  of  the  characteristics  of  this  program  derivation: 

(1) It is formally controlled. If a program gets into the wrong form, if an idea
occurs for a new and better form, or if there is simply a need to step up to
more complex programs, then there is a bridge to systematically transfer
correctness from the old to the new form, which seldom requires much
new proof. However, the process is tedious and requires considerable
program rewriting.

(2) The assertions have a clean structure which breaks into various parts:

Nodes(T)=... the dominant clause, describing the goal of the
program

NoNilNodesOnStack(S)... a property which assisted a space
optimization

C=Count(NodeList)... relates a concrete value C to an abstract value
NodeList

t≠nil... a special condition picked up at the loop start

From these, we can conjecture some possible "laws" of assertions:

(a)  Ghost  variables  represent  missing  abstractions, 

(b)  Assertion  clauses  may  be  classified  as  dominant,  by  association 
with  the  least  optimized  abstract  program,  or  optimizational,  by 
association  with  some  property  used  for  optimization, 

(c)  Assertion  clauses  may  be  proved  one  by  one  in  some  strategic 
order, 

(d) Many assertions have the form FinalResult=Current
...YetToBeDone because they originate from functions.

Versions 1 and 2 can be reused for other problems, and other orders of tree
traversal may be modeled after this one. The price for this generality is that
the instantiated schemas must be optimized. There is a tradeoff between
finding the optimizing transformations and the supporting assertion
increments, on the one hand, and finding the fully optimized final program
and the assertions for it, on the other.

There  are  numerous  questions  about  this  approach: 

(a) Are the laws valid? useful? (How do we decide this?)

(b)  How  can  the  tedium  of  managing  multiple  versions  be  reduced? 

(c) How hard is it to find schemas? Are they worth the effort? What
level  of  abstraction  provides  the  greatest  payoff?  For  example,  is 
there  a generalization  of  Version  1 from  which  tree  traversal  is  a 
direct  instantiation? 

(d) How hard is it to maintain a catalog of schemas and transformations?